March 15, 2017
Reason 1: It's free
Reason 2: It's "open source"
Reason 3: It's beautiful
Reason 4: It's powerful
Reason 5: It's fun
Download R: https://www.r-project.org/
Download RStudio: https://www.rstudio.com/products/rstudio/download/
Let's write some code!
2 + 2
[1] 4
x <- c(1,2,3,4,5)
x
[1] 1 2 3 4 5
barplot(x)
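Once a vector like x exists, most R operations apply to every element at once. A minimal sketch using only base R (the names doubled, total and n are chosen here for illustration):

```r
# Same vector as in the slides
x <- c(1, 2, 3, 4, 5)

# Arithmetic is applied to each element
doubled <- x * 2

# Summaries collapse the vector to a single value
total <- sum(x)
n <- length(x)

print(doubled)  # [1]  2  4  6  8 10
print(total)    # [1] 15
print(n)        # [1] 5
```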
A "package" is simply a collection of code written by someone else.
It's what makes R powerful, but also confusing.
You only have to install a package one time.
install.packages('dplyr')
install.packages('devtools')
devtools::install_github('databrew/databrew')
devtools::install_github('joebrew/cism')
You have to call the library function in every new R session in which you use a package.
library(databrew)
library(cism)
library(sp)
Writing library just means "I am going to use this package".
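The "install once, load every session" rule can be sketched as a small helper. The function install_if_missing is hypothetical (it is not part of R or of any package mentioned here); it relies only on the base functions requireNamespace and install.packages. The example uses the stats package, which ships with R, so nothing is actually downloaded:

```r
# Hypothetical helper: install a package only when it is not yet available.
# Returns TRUE if an installation was attempted, FALSE otherwise.
install_if_missing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
    return(TRUE)
  }
  FALSE
}

# 'stats' comes with every R installation, so no install happens here
installed_now <- install_if_missing('stats')
print(installed_now)  # [1] FALSE

# Loading still has to happen in every session
library(stats)
```

In a fresh session the same pattern could be used with 'dplyr' in place of 'stats'; in that case install.packages needs an internet connection the first time.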
Since we've already written library(cism), now we can use some tools from the cism package.
plot(moz0)
plot(man3)
a <- 1
a + 3
[1] 4
Let's create an object called "ages", with the age of everyone
ages <- c(30, 26, 31, 39, 45, 27, 28, 22, 19, 30, 35)
How do we view our ages object?
ages
[1] 30 26 31 39 45 27 28 22 19 30 35
How do we view just the first element of our ages object?
ages[1]
[1] 30
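Square brackets accept more than a single position. A short sketch, assuming the same ages values shown in the output above (the variable names first_three, last_age and over_30 are illustrative):

```r
ages <- c(30, 26, 31, 39, 45, 27, 28, 22, 19, 30, 35)

# A range of positions
first_three <- ages[1:3]

# The last element, whatever the length
last_age <- ages[length(ages)]

# A logical condition keeps only the matching elements
over_30 <- ages[ages > 30]

print(first_three)  # [1] 30 26 31
print(last_age)     # [1] 35
print(over_30)      # [1] 31 39 45 35
```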
How do we sort our ages object?
sorted_ages <- sort(ages)
sorted_ages
[1] 19 22 26 27 28 30 30 31 35 39 45
How do we get the minimum, maximum, average age?
min(ages)
[1] 19
max(ages)
[1] 45
mean(ages)
[1] 30.18182
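Base R has a few more one-line summaries that work the same way. A small sketch on the same ages values (median, range and summary are base R functions; the variable names are illustrative):

```r
ages <- c(30, 26, 31, 39, 45, 27, 28, 22, 19, 30, 35)

# Middle value of the sorted ages
middle_age <- median(ages)

# Minimum and maximum in one call
age_range <- range(ages)

print(middle_age)  # [1] 30
print(age_range)   # [1] 19 45

# Several summaries at once
summary(ages)
```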
How do we visualize our ages object?
hist(ages)
Always save your scripts.
Never save your "workspace".
Work in "projects"
We're going to use the cism package to get weather data for the FQMA weather station (Maputo).
library(cism)
??get_weather
weather <- get_weather(station = 'FQMA',
start_year = 2010,
end_year = 2016)
Now that we have our weather data, we can look at it.
head(weather)
        date temp_max temp_mean temp_min humidity_max humidity_mean
1 2010-01-01       34        30       26           94            66
2 2010-01-02       31        27       24           89            72
3 2010-01-03       32        28       24           94            79
4 2010-01-04       31        26       21          100            84
5 2010-01-05       25        23       21          100            82
6 2010-01-06       28        24       20           83            69
  humidity_min precipitation cloud_cover location
1           52             0           2     FQMA
2           55             0           5     FQMA
3           55             0           6     FQMA
4           58             0           6     FQMA
5           65             0           6     FQMA
6           54             0           3     FQMA
# 1. How many rows are in our data?
nrow(weather)
[1] 2302
# 2. How many columns?
ncol(weather)
[1] 10
# 3. What are the names of the columns?
colnames(weather)
[1] "date"          "temp_max"      "temp_mean"     "temp_min"
[5] "humidity_max"  "humidity_mean" "humidity_min"  "precipitation"
[9] "cloud_cover"   "location"
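The same questions can be answered in fewer calls with dim and str, both from base R. The sketch below uses a tiny made-up data frame called small (its values are copied from the first rows shown above, but the object itself is an assumption) so that it runs without the cism package:

```r
# A tiny stand-in for the weather data (illustrative only)
small <- data.frame(
  date = as.Date(c('2010-01-01', '2010-01-02', '2010-01-03')),
  temp_max = c(34, 31, 32)
)

# Rows and columns in one call
dims <- dim(small)
print(dims)  # [1] 3 2

# Column names, types and the first few values
str(small)
```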
# 4. What is the date range?
range(weather$date)
[1] "2010-01-01" "2016-12-12"
# 5. What is the maximum temperature?
max(weather$temp_max, na.rm = TRUE)
[1] 44
# 6. What is the minimum temperature?
min(weather$temp_min, na.rm = TRUE)
[1] 7
# 7. What is the average temperature?
mean(weather$temp_mean, na.rm = TRUE)
[1] 23.84982
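The na.rm = TRUE argument matters because a single missing value (NA) makes max, min and mean return NA. A minimal sketch with a made-up vector called temps (hypothetical values, not taken from the weather data):

```r
# Hypothetical temperatures with one missing value
temps <- c(25, NA, 31)

# Without na.rm the missing value propagates
with_na <- max(temps)
print(with_na)  # [1] NA

# With na.rm = TRUE the NA is dropped before computing
max_temp <- max(temps, na.rm = TRUE)
mean_temp <- mean(temps, na.rm = TRUE)
print(max_temp)   # [1] 31
print(mean_temp)  # [1] 28
```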
Which variables do we have which are numeric and continuous?
temp_max, temp_mean, temp_min, etc…
How can we visualize these?
boxplot(weather$temp_mean)
hist(weather$temp_mean)
Let's create a variable called "hot"
weather$hot <- ifelse(weather$temp_max > 30, 'hot', 'not hot')
head(weather)
        date temp_max temp_mean temp_min humidity_max humidity_mean
1 2010-01-01       34        30       26           94            66
2 2010-01-02       31        27       24           89            72
3 2010-01-03       32        28       24           94            79
4 2010-01-04       31        26       21          100            84
5 2010-01-05       25        23       21          100            82
6 2010-01-06       28        24       20           83            69
  humidity_min precipitation cloud_cover location     hot
1           52             0           2     FQMA     hot
2           55             0           5     FQMA     hot
3           55             0           6     FQMA     hot
4           58             0           6     FQMA     hot
5           65             0           6     FQMA not hot
6           54             0           3     FQMA not hot
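The ifelse function used to create the hot variable tests each element and returns one of two values per element. A small sketch on a stand-alone vector (the values in temp_max are illustrative; note that exactly 30 does not count as "hot" because the test is strictly greater than):

```r
# Illustrative maximum temperatures
temp_max <- c(34, 31, 25, 28, 30)

# One label per element: 'hot' when above 30, otherwise 'not hot'
hot <- ifelse(temp_max > 30, 'hot', 'not hot')

print(hot)
# [1] "hot"     "hot"     "not hot" "not hot" "not hot"
```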
table(weather$hot)
hot_table <- table(weather$hot)
hot_prop_table <- prop.table(hot_table)
barplot(hot_table)
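To see what table and prop.table return, here is a minimal sketch on a made-up vector called labels (hypothetical values): table counts each distinct value, and prop.table turns those counts into proportions that sum to 1.

```r
# Hypothetical labels: three hot days, one not hot
labels <- c('hot', 'hot', 'not hot', 'hot')

# Counts per category
counts <- table(labels)
print(counts)

# Proportions per category (counts divided by the total)
props <- prop.table(counts)
print(props)
```

Here counts holds 3 for hot and 1 for not hot, so props holds 0.75 and 0.25.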
barplot(hot_table,
main = 'Hot days in Maputo')
barplot(hot_table,
main = 'Hot days in Maputo',
ylab = 'Number of days')
barplot(hot_table,
main = 'Hot days in Maputo',
ylab = 'Number of days',
xlab = 'Temperature')
barplot(hot_table,
main = 'Hot days in Maputo',
ylab = 'Number of days',
xlab = 'Temperature',
col = c('red', 'blue'))
barplot(hot_table,
main = 'Hot days in Maputo',
ylab = 'Number of days',
xlab = 'Temperature',
col = c('red', 'blue'),
border = 'darkgrey')
Let's create a plot of date (x-axis) and the maximum temperature (y-axis)
plot(weather$date,
weather$temp_max)
Let's make our plot prettier
plot(weather$date,
weather$temp_max,
type = 'l',
col = 'red',
xlab = 'Date',
ylab = 'Maximum temperature',
main = 'Maximum temperature in Maputo')
We're going to analyze where Joe is, using data from Google. The data is part of the databrew package.
# Load package
library(databrew)
# Get data
joe <- joe
Let's have a look at the structure of our data.
head(joe)
        date                time longitude  latitude velocity altitude
1 2017-03-13 2017-03-13 11:08:06  32.79699 -25.40760       NA       NA
2 2017-03-13 2017-03-13 11:06:01  32.79699 -25.40760       NA       NA
3 2017-03-13 2017-03-13 11:05:32  32.80439 -25.40608       NA       NA
4 2017-03-13 2017-03-13 11:03:03  32.80439 -25.40608       NA       NA
5 2017-03-13 2017-03-13 11:01:03  32.80545 -25.40844       NA       NA
6 2017-03-13 2017-03-13 11:00:16  32.80545 -25.40779       NA       NA
  heading accuracy
1      NA     2500
2      NA     2500
3      NA     1899
4      NA     1899
5      NA      400
6      NA      699
Let's filter our data so that it only contains observations for the period from March 7-13.
joe_filtered <- joe[joe$date >= '2017-03-07' &
joe$date <= '2017-03-13',]
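The filter above works by building a TRUE/FALSE value for every row and keeping only the TRUE rows. A self-contained sketch on a made-up data frame called visits (the data and the names visits, in_range and visits_filtered are assumptions for illustration), with the date column stored as Date:

```r
# Hypothetical location log with one row per day
visits <- data.frame(
  date = as.Date(c('2017-03-05', '2017-03-07', '2017-03-10', '2017-03-14')),
  place = c('A', 'B', 'C', 'D')
)

# TRUE for rows inside the period, FALSE otherwise
in_range <- visits$date >= as.Date('2017-03-07') &
  visits$date <= as.Date('2017-03-13')

# Keep the TRUE rows and all columns (note the comma)
visits_filtered <- visits[in_range, ]

print(in_range)               # [1] FALSE  TRUE  TRUE FALSE
print(nrow(visits_filtered))  # [1] 2
```

Both ends of the period are included because the comparisons use >= and <=.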
Now let's use the cism package to plot Manhiça.
library(cism)
library(sp)
manhica <- man3
plot(manhica)
The databrew package has a nice function called visualize_location. Let's try it out
?visualize_location
visualize_location(x = joe_filtered,
spdf = manhica)
Let's also try with an interactive map
visualize_location(x = joe_filtered,
use_leaflet = TRUE)